Speech Recognition and Synthesis

Author

Douglas W. Beeks

Abstract

The application of speech recognition (SR) in aviation is evolving rapidly and moving toward more common use on future flightdecks. The concept is not new: speech recognition and voice control (VC) have been researched for more than 20 years, and many of the proposed benefits have been demonstrated in varied applications. Continuing advances in computer hardware and software are making voice control applications on the flightdeck more practical, flexible, and reliable. There is little argument that the easiest and most natural way for a human to interact with a computer is by direct voice input (DVI). Although speech recognition has improved over the past several years, it has not reached the capability and reliability of one person talking to another. Using SR and DVI on a flightdeck likely brings to mind the computer on board the starship Enterprise from the science fiction classic Star Trek, or possibly the HAL9000 computer from the movie 2001: A Space Odyssey. The expectation of a voice control system like these is that it be highly reliable, work in adverse and stressful conditions, be transparent to the user, and understand its users accurately without requiring them to tailor their speech and vocabulary to suit the system. Current speech recognition and voice control systems cannot achieve this level of performance, although the capability and flexibility of speech recognition and its application to voice control have increased over the past few years. Whether a speech recognition system will ever function at the level of one person speaking to another remains to be seen.


Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...


A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...


Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

The setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of anger and happiness on speech was evaluated, and the results were compared with neutral speech. For this purpose, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...


Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...


Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics in the multimedia domain concerns enabling computers to produce speech. Speech synthesis gives the computer the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...


Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...



Publication date: 2008
